---
title: IceVision Bboxes - Real Data
keywords: fastai
sidebar: home_sidebar
nb_path: "nbs/IceVision-on-espiownage-cleaner.ipynb"
---
{% raw %}
{% endraw %}

This is a mashup of IceVision's "Custom Parser" example and their "Getting Started (Object Detection)" notebooks, used to analyze the SPNet Real dataset, for which I generated bounding boxes. -- S.H. Hawley, July 1, 2021

Installing IceVision and IceData

If you are on Colab, run the following cell; otherwise, check the installation instructions.

{% raw %}
 
{% endraw %} {% raw %}
#try:
#    !wget https://raw.githubusercontent.com/airctic/icevision/master/install_colab.sh
#    !chmod +x install_colab.sh && ./install_colab.sh
#except:
#    print("Ignore the error messages and just keep going")
{% endraw %} {% raw %}
# probably want 
# pip3 install torch==1.8.2+cu111 torchvision==0.9.2+cu111 torchaudio==0.8.2 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
import torch, re 
tv, cv = torch.__version__, torch.version.cuda
tv = re.sub(r'\+cu.*', '', tv)  # strip any "+cuXXX" build suffix from the version string
TORCH_VERSION = 'torch'+tv[0:-1]+'0'
CUDA_VERSION = 'cu'+cv.replace('.','')

print(f"TORCH_VERSION={TORCH_VERSION}; CUDA_VERSION={CUDA_VERSION}")
print(torch.__version__)
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.cuda.current_device())
print(torch.cuda.get_device_name())
TORCH_VERSION=torch1.8.0; CUDA_VERSION=cu102
1.8.1+cu102
True
1
0
NVIDIA GeForce RTX 2070 with Max-Q Design
{% endraw %} {% raw %}
#!pip install -q mmcv-full=="1.3.8" -f https://download.openmmlab.com/mmcv/dist/{CUDA_VERSION}/{TORCH_VERSION}/index.html --upgrade
#!pip install -q mmdet
{% endraw %}

Imports

As always, let's import everything from icevision. Additionally, we will need pandas (you might need to install it with pip install pandas).

{% raw %}
from icevision.all import *
import pandas as pd
INFO     - The mmdet config folder already exists. No need to downloaded it. Path : /home/shawley/.icevision/mmdetection_configs/mmdetection_configs-2.10.0/configs | icevision.models.mmdet.download_configs:download_mmdet_configs:17
{% endraw %}

Load the dataset

The original IceVision tutorial used a small sample of a chess dataset (the full version is offered by Roboflow); here we instead load the espiownage "Real" data from a local directory, as shown below.

{% raw %}
#data_dir = icedata.load_data(data_url, 'chess_sample') / 'chess_sample-master'

# OLD SPNET Real Dataset link (currently proprietary, thus link may not work)
#data_url = "https://hedges.belmont.edu/~shawley/spnet_sample-master.zip"
#data_dir = icedata.load_data(data_url, 'spnet_sample') / 'spnet_sample-master' 

# espiownage cyclegan dataset: cyclegan is public for demo / reproducibility
#data_url = 'https://hedges.belmont.edu/~shawley/espiownage-cyclegan.tgz'
#data_dir = icedata.load_data(data_url, 'espiownage-cyclegan') / 'espiownage-cyclegan'

from pathlib import Path
data_dir = Path('/home/shawley/datasets/espiownage-cleaner') # real data is local and private.
{% endraw %}

Understand the data format

In this task we were given a .csv file with annotations; let's take a look at it.

!!! danger "Important"
Replace data_dir with your own path to the dataset directory.

{% raw %}
df = pd.read_csv(data_dir / "bboxes/annotations.csv")
df.head()
   filename                 width  height  label  xmin  ymin  xmax  ymax
0  06240907_proc_00254.png    512     384      1    31   135   184   290
1  06240907_proc_00256.png    512     384      0    65   153   168   270
2  06240907_proc_00270.png    512     384      1    45   149   164   280
3  06240907_proc_00281.png    512     384     10     0   111   185   340
4  06240907_proc_00281.png    512     384      1   254   134   353   215
{% endraw %}

At first glance, we can make the following assumptions:

  • Multiple rows with the same filename, width, height
  • A label for each row
  • A bbox [xmin, ymin, xmax, ymax] for each row

Once we know what our data provides we can create our custom Parser.
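These assumptions are easy to verify with pandas before writing any parser code. A minimal standalone sketch, using a toy frame that stands in for the real CSV (values copied from the rows shown above):

```python
import pandas as pd

# Toy stand-in for bboxes/annotations.csv, with the same columns as above.
df = pd.DataFrame({
    'filename': ['06240907_proc_00281.png', '06240907_proc_00281.png',
                 '06240907_proc_00254.png'],
    'width':  [512, 512, 512],
    'height': [384, 384, 384],
    'label':  [10, 1, 1],
    'xmin': [0, 254, 31], 'ymin': [111, 134, 135],
    'xmax': [185, 353, 184], 'ymax': [340, 215, 290],
})

# Assumption check: several rows can share one filename (one bbox per row),
# so grouping by filename gives the number of boxes per image.
boxes_per_image = df.groupby('filename').size().to_dict()
print(boxes_per_image)
# {'06240907_proc_00254.png': 1, '06240907_proc_00281.png': 2}
```

This is exactly the grouping the parser will rely on: every row with the same filename ends up in the same record.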

{% raw %}
df['label'] = 'A'  # antinode
df.head()
   filename                 width  height  label  xmin  ymin  xmax  ymax
0  06240907_proc_00254.png    512     384      A    31   135   184   290
1  06240907_proc_00256.png    512     384      A    65   153   168   270
2  06240907_proc_00270.png    512     384      A    45   149   164   280
3  06240907_proc_00281.png    512     384      A     0   111   185   340
4  06240907_proc_00281.png    512     384      A   254   134   353   215
{% endraw %}

Create the Parser

The first step is to create a template record for our specific type of dataset, in this case we're doing standard object detection:

{% raw %}
template_record = ObjectDetectionRecord()
{% endraw %}

Now use the method generate_template, which will print out all the necessary steps we have to implement.

{% raw %}
Parser.generate_template(template_record)
class MyParser(Parser):
    def __init__(self, template_record):
        super().__init__(template_record=template_record)
    def __iter__(self) -> Any:
    def __len__(self) -> int:
    def record_id(self, o: Any) -> Hashable:
    def parse_fields(self, o: Any, record: BaseRecord, is_new: bool):
        record.set_img_size(<ImgSize>)
        record.set_filepath(<Union[str, Path]>)
        record.detection.add_bboxes(<Sequence[BBox]>)
        record.detection.set_class_map(<ClassMap>)
        record.detection.add_labels(<Sequence[Hashable]>)
{% endraw %}

We can copy the template and use it as our starting point. Let's go over each of the methods we have to define:

  • __init__: What happens here is completely up to you, normally we have to pass some reference to our data, data_dir in our case.

  • __iter__: This tells our parser how to iterate over our data, each item returned here will be passed to parse_fields as o. In our case we call df.itertuples to iterate over all df rows.

  • __len__: How many items we will be iterating over.

  • record_id: Should return a Hashable (int, str, etc.). In our case we want all the dataset rows that share the same filename to be unified into the same record.

  • parse_fields: Here is where the attributes of the record are collected, the template will suggest what methods we need to call on the record and what parameters it expects. The parameter o it receives is the item returned by __iter__.

!!! danger "Important"
Be sure to pass the correct type on all record methods!
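The iteration pattern above hinges on DataFrame.itertuples, which yields one namedtuple per row with columns accessible as attributes; that namedtuple is the `o` that parse_fields receives. A small standalone sketch (toy data, no IceVision required):

```python
import pandas as pd

# Two boxes for the same image, as happens in the annotations file.
df = pd.DataFrame({'filename': ['x.png', 'x.png'],
                   'xmin': [10, 50], 'ymin': [20, 60],
                   'xmax': [30, 70], 'ymax': [40, 80]})

# Each tuple exposes columns as attributes, exactly like `o` in parse_fields.
rows = [(o.filename, (o.xmin, o.ymin, o.xmax, o.ymax)) for o in df.itertuples()]
print(rows)
```

Because both rows return the same record_id ('x.png'), the parser will call parse_fields twice for one record, appending a bbox each time; is_new is True only on the first call.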

{% raw %}
class ChessParser(Parser):  # name kept from the original chess tutorial
    def __init__(self, template_record, data_dir):
        super().__init__(template_record=template_record)
        
        self.data_dir = data_dir
        self.df = pd.read_csv(data_dir / "bboxes/annotations.csv")
        self.df['label'] = 'A'  # make them all the same object
        self.class_map = ClassMap(list(self.df['label'].unique()))
        
    def __iter__(self) -> Any:
        for o in self.df.itertuples():
            yield o
        
    def __len__(self) -> int:
        return len(self.df)
        
    def record_id(self, o) -> Hashable:
        return o.filename
        
    def parse_fields(self, o, record, is_new):
        if is_new:
            record.set_filepath(self.data_dir / 'images' / o.filename)
            record.set_img_size(ImgSize(width=o.width, height=o.height))
            record.detection.set_class_map(self.class_map)
        
        record.detection.add_bboxes([BBox.from_xyxy(o.xmin, o.ymin, o.xmax, o.ymax)])
        record.detection.add_labels([o.label])
{% endraw %}

Let's randomly split the data and parse it with Parser.parse:

{% raw %}
parser = ChessParser(template_record, data_dir)
{% endraw %} {% raw %}
train_records, valid_records = parser.parse()
INFO     - Autofixing records | icevision.parsers.parser:parse:136
{% endraw %}
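Under the hood, parser.parse groups rows into records by record_id and then splits those records randomly (IceVision defaults to roughly an 80/20 train/valid split). As an illustration of that splitting idea in plain Python -- the names and the 80/20 ratio here are for illustration, not IceVision's actual internals:

```python
import random

# Hypothetical record ids, one per image.
record_ids = [f"img_{i:03d}" for i in range(10)]

rng = random.Random(42)      # fixed seed so the split is reproducible
shuffled = record_ids[:]
rng.shuffle(shuffled)
cut = int(0.8 * len(shuffled))
train_ids, valid_ids = shuffled[:cut], shuffled[cut:]
print(len(train_ids), len(valid_ids))   # 8 2
```

Splitting by record id (rather than by CSV row) matters: all boxes for one image stay on the same side of the split, so no image leaks between train and valid.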

Let's take a look at one record:

{% raw %}
show_record(train_records[5], display_label=False, figsize=(14, 10))
{% endraw %} {% raw %}
train_records[0]
BaseRecord

common: 
	- Image size ImgSize(width=512, height=384)
	- Filepath: /home/shawley/datasets/espiownage-cleaner/images/06240907_proc_01386.png
	- Img: None
	- Record ID: 657
detection: 
	- BBoxes: [<BBox (xmin:219, ymin:124, xmax:362, ymax:265)>, <BBox (xmin:228, ymin:11, xmax:271, ymax:84)>, <BBox (xmin:285, ymin:22, xmax:328, ymax:93)>, <BBox (xmin:0, ymin:106, xmax:167, ymax:321)>, <BBox (xmin:260, ymin:286, xmax:360, ymax:384)>]
	- Class Map: <ClassMap: {'background': 0, 'A': 1}>
	- Labels: [1, 1, 1, 1, 1]
{% endraw %}

Moving On...

Following the Getting Started "refrigerator" notebook...

{% raw %}
# size is set to 384 because EfficientDet requires its inputs to be divisible by 128
image_size = 384  
train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=image_size, presize=512), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(image_size), tfms.A.Normalize()])

# Datasets
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
{% endraw %}
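The divisible-by-128 constraint mentioned in the comment above is easy to sanity-check programmatically; a trivial sketch for candidate sizes near our 512-pixel source images:

```python
# EfficientDet requires input sides divisible by 128; filter candidate sizes.
candidates = [256, 320, 384, 448, 512]
valid_sizes = [s for s in candidates if s % 128 == 0]
print(valid_sizes)   # [256, 384, 512]
```

384 is the largest such size that still leaves headroom below the 512-pixel presize used for augmentation.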

This next cell generates an error; ignore it and move on.

{% raw %}
samples = [train_ds[0] for _ in range(3)]
show_samples(samples, ncols=3)
{% endraw %} {% raw %}
model_type = models.mmdet.retinanet
backbone = model_type.backbones.resnet50_fpn_1x(pretrained=True)
{% endraw %} {% raw %}
selection = 0


extra_args = {}

if selection == 0:
  model_type = models.mmdet.retinanet
  backbone = model_type.backbones.resnet50_fpn_1x

elif selection == 1:
  # The Retinanet model is also implemented in the torchvision library
  model_type = models.torchvision.retinanet
  backbone = model_type.backbones.resnet50_fpn

elif selection == 2:
  model_type = models.ross.efficientdet
  backbone = model_type.backbones.tf_lite0
  # The efficientdet model requires an img_size parameter
  extra_args['img_size'] = image_size

elif selection == 3:
  model_type = models.ultralytics.yolov5
  backbone = model_type.backbones.small
  # The yolov5 model requires an img_size parameter
  extra_args['img_size'] = image_size

model_type, backbone, extra_args
(<module 'icevision.models.mmdet.models.retinanet' from '/home/shawley/envs/icevision/lib/python3.9/site-packages/icevision/models/mmdet/models/retinanet/__init__.py'>,
 <icevision.models.mmdet.models.retinanet.backbones.resnet_fpn.MMDetRetinanetBackboneConfig at 0x7fe7d223b460>,
 {})
{% endraw %} {% raw %}
model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(parser.class_map), **extra_args) 
/home/shawley/envs/icevision/lib/python3.9/site-packages/mmdet/core/anchor/builder.py:16: UserWarning: ``build_anchor_generator`` would be deprecated soon, please use ``build_prior_generator`` 
  warnings.warn(
Use load_from_local loader
The model and loaded state dict do not match exactly

size mismatch for bbox_head.retina_cls.weight: copying a param with shape torch.Size([720, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([9, 256, 3, 3]).
size mismatch for bbox_head.retina_cls.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([9]).
{% endraw %} {% raw %}
train_dl = model_type.train_dl(train_ds, batch_size=8, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=8, num_workers=4, shuffle=False)
{% endraw %} {% raw %}
model_type.show_batch(first(valid_dl), ncols=4)
{% endraw %} {% raw %}
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]
{% endraw %} {% raw %}
learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)
{% endraw %} {% raw %}
learn.lr_find(end_lr=0.01)

# For Sparse-RCNN, use lower `end_lr`
# learn.lr_find(end_lr=0.005)
/home/shawley/envs/icevision/lib/python3.9/site-packages/mmdet/core/anchor/anchor_generator.py:324: UserWarning: ``grid_anchors`` would be deprecated soon. Please use ``grid_priors`` 
  warnings.warn('``grid_anchors`` would be deprecated soon. '
/home/shawley/envs/icevision/lib/python3.9/site-packages/mmdet/core/anchor/anchor_generator.py:360: UserWarning: ``single_level_grid_anchors`` would be deprecated soon. Please use ``single_level_grid_priors`` 
  warnings.warn(
SuggestedLRs(lr_min=7.943282253108919e-05, lr_steep=6.30957365501672e-05)
{% endraw %} {% raw %}
learn.fine_tune(60, 1e-4, freeze_epochs=2)
epoch train_loss valid_loss COCOMetric time
0 0.633972 0.500939 0.471098 01:11
1 0.458733 0.405103 0.534495 01:07
epoch train_loss valid_loss COCOMetric time
0 0.399882 0.365127 0.577519 01:17
1 0.383493 0.358106 0.574981 01:18
2 0.356643 0.353600 0.582940 01:17
3 0.350104 0.336664 0.595354 01:16
4 0.349757 0.324829 0.610435 01:16
5 0.343559 0.322749 0.605284 01:16
6 0.328914 0.320818 0.604384 01:16
7 0.334930 0.338464 0.573311 01:16
8 0.325537 0.312643 0.612422 01:15
9 0.328969 0.315080 0.607097 01:15
10 0.325572 0.308966 0.610417 01:15
11 0.316283 0.314663 0.605090 01:17
12 0.312910 0.318825 0.610608 01:15
13 0.312268 0.306723 0.611782 01:15
14 0.306374 0.302953 0.620263 01:15
15 0.312472 0.300711 0.624251 01:15
16 0.298676 0.311334 0.624626 01:15
17 0.304012 0.298789 0.626530 01:15
18 0.299061 0.303556 0.619808 01:15
19 0.297184 0.302024 0.622421 01:15
20 0.291558 0.294137 0.635656 01:15
21 0.296315 0.299329 0.626779 01:15
22 0.280859 0.295419 0.625076 01:15
23 0.276702 0.306997 0.630702 01:16
24 0.279125 0.299035 0.640618 01:16
25 0.278860 0.295171 0.627152 01:17
26 0.276983 0.296291 0.616215 01:15
27 0.272379 0.307114 0.629732 01:16
28 0.264565 0.307630 0.622796 01:15
29 0.265112 0.296105 0.628551 01:15
30 0.260822 0.302779 0.627466 01:15
31 0.260563 0.296989 0.624150 01:15
32 0.258734 0.304162 0.619158 01:15
33 0.254267 0.298195 0.627421 01:15
34 0.247431 0.296636 0.633334 01:15
35 0.254193 0.296944 0.630530 01:14
36 0.250728 0.303455 0.631245 01:16
37 0.248071 0.313657 0.621578 01:15
38 0.246570 0.301215 0.621340 01:16
39 0.242532 0.298918 0.628174 01:16
40 0.242209 0.321441 0.628547 01:16
41 0.243928 0.305226 0.622272 01:16
42 0.235015 0.304557 0.622493 01:15
43 0.243457 0.311997 0.621422 01:15
44 0.240218 0.303820 0.625712 01:15
45 0.234489 0.313346 0.614859 01:16
46 0.229094 0.310520 0.624290 01:15
47 0.232660 0.306830 0.621698 01:16
48 0.240301 0.304431 0.618222 01:15
49 0.223245 0.314579 0.617782 01:15
50 0.233480 0.305044 0.623270 01:15
51 0.225808 0.308000 0.619419 01:15
52 0.228742 0.309542 0.622576 01:15
53 0.228203 0.314097 0.615976 01:15
54 0.220251 0.312148 0.617075 01:15
55 0.234563 0.312603 0.618413 01:15
56 0.228604 0.310252 0.619501 01:15
57 0.222558 0.310638 0.619251 01:15
58 0.235513 0.311517 0.619103 01:16
59 0.221930 0.311747 0.619217 01:15
{% endraw %} {% raw %}
model_type.show_results(model, valid_ds, detection_threshold=.5)
{% endraw %} {% raw %}
learn.save('iv_bbox_real')
learn.load('iv_bbox_real'); 
{% endraw %}

Predictions in bulk

Run through the whole dataset, predict on everything, write out the bounding boxes, and order by top losses.

{% raw %}
infer_dl = model_type.infer_dl(infer_ds, batch_size=1)
preds = model_type.predict_from_dl(model=model, infer_dl=infer_dl, keep_images=True)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-32-e6f2ca772421> in <module>
----> 1 infer_dl = model_type.infer_dl(infer_ds, batch_size=1)
      2 preds = model_type.predict_from_dl(model=model, infer_dl=infer_dl, keep_images=True)

NameError: name 'infer_ds' is not defined
{% endraw %} {% raw %}
preds = model_type.predict(model, infer_ds, keep_images=True)
0.00% [0/49 00:00<00:00]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-c7c928f7bc40> in <module>
----> 1 preds, targs, losses = learn.get_preds(with_loss=True) # validation set only

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, cbs, **kwargs)
    248         if with_loss: ctx_mgrs.append(self.loss_not_reduced())
    249         with ContextManagers(ctx_mgrs):
--> 250             self._do_epoch_validate(dl=dl)
    251             if act is None: act = getattr(self.loss_func, 'activation', noop)
    252             res = cb.all_tensors()

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in _do_epoch_validate(self, ds_idx, dl)
    198         if dl is None: dl = self.dls[ds_idx]
    199         self.dl = dl
--> 200         with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
    201 
    202     def _do_epoch(self):

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    158 
    159     def _with_events(self, f, event_type, ex, final=noop):
--> 160         try: self(f'before_{event_type}');  f()
    161         except ex: self(f'after_cancel_{event_type}')
    162         self(f'after_{event_type}');  final()

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in all_batches(self)
    164     def all_batches(self):
    165         self.n_iter = len(self.dl)
--> 166         for o in enumerate(self.dl): self.one_batch(*o)
    167 
    168     def _do_one_batch(self):

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in one_batch(self, i, b)
    189         b = self._set_device(b)
    190         self._split(b)
--> 191         self._with_events(self._do_one_batch, 'batch', CancelBatchException)
    192 
    193     def _do_epoch_train(self):

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    158 
    159     def _with_events(self, f, event_type, ex, final=noop):
--> 160         try: self(f'before_{event_type}');  f()
    161         except ex: self(f'after_cancel_{event_type}')
    162         self(f'after_{event_type}');  final()

~/envs/icevision/lib/python3.9/site-packages/fastai/learner.py in _do_one_batch(self)
    170         self('after_pred')
    171         if len(self.yb):
--> 172             self.loss_grad = self.loss_func(self.pred, *self.yb)
    173             self.loss = self.loss_grad.clone()
    174         self('after_loss')

TypeError: loss_fn() got an unexpected keyword argument 'reduction'
{% endraw %} {% raw %}
learn.predict()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-17c52fb00833> in <module>
----> 1 learn.predict()

TypeError: predict() missing 1 required positional argument: 'item'
{% endraw %}
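The failures above come from two things: infer_ds was never defined, and fastai's learn.get_preds(with_loss=True) is incompatible with this model's loss function. A sketch of what a working bulk-prediction cell might look like, reusing only calls that appear earlier in this notebook (untested here, given the errors above; show_preds is assumed available from the icevision.all import):

```python
# Build an inference dataset the same way valid_ds was built above,
# then predict in bulk with the model_type helpers rather than fastai's get_preds.
infer_ds = Dataset(valid_records, valid_tfms)   # or parse a separate held-out set
infer_dl = model_type.infer_dl(infer_ds, batch_size=4)
preds = model_type.predict_from_dl(model=model, infer_dl=infer_dl, keep_images=True)

show_preds(preds=preds[:4])   # visualize a few predictions with their boxes
```

Ordering by top losses would still require per-sample losses, which is exactly what failed above, hence the forum question below.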

Follow-up:

See the IceVision forum.